Yandex Source Code Leak and what it means for SEO pt.1 — Mike King // iPullRank

Mike King, Founder and Managing Director at iPullRank, talks about the Yandex source code leak. Yandex, the world’s fourth largest search engine, suffered a major source code leak when a former employee published parts of the company’s internal repository online. While Yandex isn’t Google, there is a lot SEOs can learn about how a modern search engine is built from reviewing the codebase. Today, Mike discusses the Yandex source code leak and what it means for SEO.

About the speaker

Mike King

iPullRank

Mike is Founder and Managing Director at iPullRank

Part 1 Why Mobile Only Indexation is a Problem — Mike King // iPullRank
Part 2 How to prepare for Mobile-Only Indexation — Mike King // iPullRank
Part 3 Yandex Source Code Leak and what it means for SEO pt.1 — Mike King // iPullRank

Show Notes

·05:21 - Examining the Yandex source code: The Yandex source code leak provides insight into how a search engine operates. Examining the codebase can expand our understanding of search engine ranking factors and help improve our search engine optimization practices.
·08:13 - Analysis of Yandex search engine codebase: Some directories referenced in the codebase were missing from the leak. However, it did include information such as the initial weights for a series of ranking factors in the search engine's algorithm, which gives a better understanding of modern search.
·12:42 - Yandex's lack of JavaScript rendering and its impact on relevance: Yandex has only a beta version of JavaScript rendering for its search engine. That limits how relevant its results can be, because a large portion of the web is built with React, Angular, and Vue, which require JavaScript rendering for a more robust understanding of the page.
·15:00 - Phrase-based indexing comparison between Yandex and Google: Google limits its phrases to 32 grams, while Yandex has a limit of 64 grams. Despite Yandex's longer phrase limit, the use of BERT in their indexing process could improve relevance, as queries are turned into embeddings rather than combinations of words.
·18:47 - Exploring neural ranking algorithms in Yandex's codebase: Mike is still investigating the ranking system in Yandex's search engine and trying to understand how it works. He's also started a Slack community called "The Index" to invite other technically inclined SEOs to help dig into the code and build wiki-style documentation.
·20:24 - Opportunity to join the Yandex decoding effort: Mike is inviting interested SEOs to join "The Index" Slack community and contribute to the discussion of the Yandex search engine. The project's GitHub repository, called the "Yandex decoder ring", is public and open to pull requests.
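To make the idea of "initial weights for a series of ranking factors" concrete, here is a minimal sketch of one simple way weights could combine factor values into a score: a plain weighted sum. This is an assumption for illustration only. The episode does not describe Yandex's actual formula, and the factor names, weights, and documents below are hypothetical.

```python
from typing import Dict, List

# Hypothetical factor names and weights, chosen only for illustration.
# They are not taken from the Yandex codebase.
INITIAL_WEIGHTS: Dict[str, float] = {
    "text_relevance": 0.5,
    "link_authority": 0.3,
    "freshness": 0.2,
}


def score(factors: Dict[str, float],
          weights: Dict[str, float] = INITIAL_WEIGHTS) -> float:
    """Weighted sum of factor values; a factor missing from a document counts as 0."""
    return sum(w * factors.get(name, 0.0) for name, w in weights.items())


def rank(docs: Dict[str, Dict[str, float]]) -> List[str]:
    """Return document ids ordered from highest to lowest score."""
    return sorted(docs, key=lambda doc_id: score(docs[doc_id]), reverse=True)


# Two made-up documents with made-up factor values.
docs = {
    "page_a": {"text_relevance": 0.9, "link_authority": 0.2, "freshness": 0.1},
    "page_b": {"text_relevance": 0.4, "link_authority": 0.9, "freshness": 0.8},
}
ranking = rank(docs)
```

In this toy setup, page_b outranks page_a despite weaker text relevance, because the other two factors carry enough combined weight. A real engine with thousands of factors would likely use learned, non-linear models rather than a hand-set sum.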

Episode Summary

·"So much of our understanding of SEO is based on things we learned in 2003. But Google is just so much more sophisticated now." -Mike King, Founder, iPullRank
·"Understanding that by explicitly seeing in the Yandex codebase that there were about 18,000 different ranking factors allows us to expand our thinking about what Google might be considering." -Mike King, Founder, iPullRank
·"People were using the AOL leak data from 2006 for a good five years to build CTR models for Google." -Mike King, Founder, iPullRank
·"The way that an index is structured is it's not like you just go to one computer and hit a database, and then get all the documents back. It's distributed across 1000s of computers." -Mike King, Founder, iPullRank
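The last quote describes an index spread across many machines. A minimal sketch of that idea, assuming a document-sharded index where each shard returns its own best matches and a coordinator merges them, could look like the following. The Shard class, the search_all function, and the sample postings are hypothetical and are not drawn from the Yandex code.

```python
import heapq
from typing import Dict, List, Tuple


class Shard:
    """One machine's slice of the index: term -> {doc_id: score}."""

    def __init__(self, postings: Dict[str, Dict[str, float]]):
        self.postings = postings

    def search(self, term: str, k: int) -> List[Tuple[float, str]]:
        """Return this shard's top-k (score, doc_id) pairs for a term."""
        hits = self.postings.get(term, {})
        return heapq.nlargest(k, ((s, d) for d, s in hits.items()))


def search_all(shards: List[Shard], term: str, k: int) -> List[str]:
    """Send the query to every shard, then merge the partial results."""
    partial: List[Tuple[float, str]] = []
    for shard in shards:  # a real system would query shards in parallel
        partial.extend(shard.search(term, k))
    return [doc for _, doc in heapq.nlargest(k, partial)]


# Two made-up shards, each holding different documents.
shards = [
    Shard({"seo": {"doc1": 0.7, "doc2": 0.2}}),
    Shard({"seo": {"doc3": 0.9}, "yandex": {"doc4": 0.5}}),
]
top = search_all(shards, "seo", k=2)
```

Each shard only needs to return its own top k results, since no document outside a shard's local top k can appear in the global top k.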

Voices of Search
